Which structures are out there? Learning predictive compositional concepts based on social sensorimotor explorations
How do we learn to think about our world in a flexible, compositional manner? What is the actual content of a particular thought? How do we become language-ready? I argue that free energy-based inference processes, which determine the learning of predictive encodings, need to incorporate additional structural learning biases that reflect those structures of our world that are behaviorally relevant for us. In particular, I argue that the inference processes and thus the resulting predictive encodings should enable (i) the distinction of space from entities, with their perceptually and behaviorally relevant properties, (ii) the flexible, temporary activation of relative spatial relations between different entities, (iii) the dynamic adaptation of the involved, distinct encodings while executing, observing, or imagining particular interactions, and (iv) the development of a – probably motor-grounded – concept of forces, which predictively encodes the results of relative spatial and property manipulations dynamically over time. Furthermore, seeing that entity interactions typically have a beginning and an end, free energy-based inference should be additionally biased towards the segmentation of continuous sensorimotor interactions and sensory experiences into events and event boundaries. Events may thereby be characterized by particular sets of active predictive encodings. Event boundaries, on the other hand, identify those situational aspects that are critical for the commencement or the termination of a particular event, such as the establishment of object contact and contact release. I argue that the development of predictive event encodings naturally leads to the development of conceptual encodings and to the possibility of composing these encodings in a highly flexible, semantic manner. Behavior is generated by means of active inference. The addition of internal motivations in the form of homeostatic variables focuses our behavior – including attention and thought – on those environmental interactions that are motivationally relevant, thus continuously striving for internal homeostasis in a goal-directed manner. As a consequence, behavior focuses cognitive development towards (believed) bodily and cognitively (including socially) relevant aspects. The capacity to integrate tools and other humans into our minds, as well as the motivation to flexibly interact with them, seems to open up the possibility of assigning roles – such as actors, instruments, and recipients – when observing, executing, or imagining particular environmental interactions. Moreover, in conjunction with predictive event encodings, this tool- and socially-oriented mental flexibilization fosters perspective taking, reasoning, and other forms of mentalizing. Finally, I discuss how these structures and mechanisms are exactly those that seem necessary to make our minds language-ready.
Learning Behavior-Grounded Event Segmentations
The event segmentation theory (EST) postulates that humans systematically segment the continuous sensorimotor information flow into events and event boundaries. The basis for the observed segmentation tendencies, however, remains largely unknown. We introduce a computational model that grounds EST in the interaction abilities of a system. The model learns events and event boundaries based on actively gathered sensorimotor signals. It segments the signals based on principles of probabilistic predictive coding and surprise. The implemented model essentially simulates, anticipates, and learns event progressions and event transitions online while interacting with the environment by means of dynamic, predictive Bayesian models. Besides the model’s event segmentation capabilities, we show that the learned encodings can be used for higher-order planning. Moreover, the encodings systematically conceptualize environmental interactions and they help to identify the factors that are critical for ensuring interaction success.
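To make the surprise-based segmentation principle concrete, here is a minimal sketch, assuming a running diagonal-Gaussian predictor over the sensorimotor signal; the class name SurpriseSegmenter, the threshold value, and the reset-on-boundary rule are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

class SurpriseSegmenter:
    """Toy surprise-driven event segmentation (illustrative sketch).
    A running diagonal-Gaussian model predicts the next sensorimotor
    observation; an event boundary is flagged when the surprise
    (negative log-likelihood) of the actual observation exceeds a
    threshold, after which the predictor resets on the new event."""

    def __init__(self, dim, threshold=8.0, lr=0.1):
        self.mu = np.zeros(dim)      # running prediction of the observation
        self.var = np.ones(dim)      # running variance estimate
        self.threshold = threshold   # surprise level that signals a boundary
        self.lr = lr                 # adaptation rate within an event

    def step(self, x):
        # Surprise = negative log-likelihood under the diagonal Gaussian.
        s = 0.5 * np.sum((x - self.mu) ** 2 / self.var
                         + np.log(2.0 * np.pi * self.var))
        if s > self.threshold:
            # Boundary: re-initialize the predictive model on the new event.
            self.mu, self.var = x.copy(), np.ones_like(x)
            return True, s
        # Within an event: slowly adapt the prediction toward the input.
        err = x - self.mu
        self.mu = self.mu + self.lr * err
        self.var = np.maximum(self.var + self.lr * (err ** 2 - self.var), 1e-3)
        return False, s

# Two synthetic "events": a boundary is detected when the signal jumps.
seg = SurpriseSegmenter(dim=3)
signal = np.vstack([np.zeros((50, 3)), np.full((50, 3), 3.0)])
for t, x in enumerate(signal):
    boundary, s = seg.step(x)
    if boundary:
        print(f"event boundary at t={t}, surprise={s:.1f}")
```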
Embodied learning of a generative neural model for biological motion perception and inference
Although an action observation network and mirror neurons for understanding the actions and intentions of others have been under deep, interdisciplinary consideration over recent years, it remains largely unknown how the brain manages to map visually perceived biological motion of others onto its own motor system. This paper shows how such a mapping may be established, even if the biological motion is visually perceived from a new vantage point. We introduce a learning artificial neural network model and evaluate it on full-body motion tracking recordings. The model implements an embodied, predictive inference approach. It first learns to correlate and segment multimodal sensory streams of its own bodily motion. In doing so, it becomes able to anticipate motion progression, to complete missing modal information, and to self-generate learned motion sequences. When biological motion of another person is observed, this self-knowledge is utilized to recognize similar motion patterns and predict their progress. Due to the relative encodings, the model shows strong robustness in recognition despite large variations in body morphology and posture dynamics. By additionally equipping the model with the capability to rotate its visual frame of reference, it is able to deduce the visual perspective onto the observed person, establishing full consistency with the embodied self-motion encodings by means of active inference. In further support of its neuro-cognitive plausibility, we also model typical bistable perceptions when crucial depth information is missing. In sum, the introduced neural model proposes a solution to the problem of how the human brain may establish correspondence between observed bodily motion and its own motor system, thus offering a mechanism that supports the development of mirror neurons.
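The perspective-taking step can be illustrated with a toy example: rotate the observed posture through candidate viewing angles and keep the angle whose aligned posture best matches the self-learned template, i.e. the rotation that minimizes prediction error. This brute-force grid search is only a stand-in for the paper's gradient-based active inference, and the function names are hypothetical:

```python
import numpy as np

def rotation_z(theta):
    """Rotation about the vertical axis, modeling a viewpoint change."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def infer_viewpoint(observed, template, n_angles=72):
    """Find the viewing angle that best aligns an observed posture
    (joints x 3 array) with the self-learned motion template, i.e.
    the rotation that minimizes the prediction error."""
    best_angle, best_err = 0.0, np.inf
    for theta in np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False):
        aligned = observed @ rotation_z(theta).T
        err = np.mean((aligned - template) ** 2)   # prediction error
        if err < best_err:
            best_angle, best_err = theta, err
    return best_angle, best_err

template = np.random.default_rng(0).normal(size=(15, 3))  # 15 joints
observed = template @ rotation_z(np.pi / 2).T             # seen from the side
angle, err = infer_viewpoint(observed, template)          # recovers the view
```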
Adaptive learning in a compartmental model of visual cortex—how feedback enables stable category learning and refinement
The categorization of real world objects is often reflected in the similarity of their visual appearances. Such categories do not necessarily form disjoint sets of objects, either semantically or visually. The relationship between categories can often be described in terms of a hierarchical structure. For instance, tigers and leopards form two separate mammalian categories, but both belong to the category of felines. In other words, tigers and leopards are subcategories of the category Felidae. In recent decades, the unsupervised learning of categories of visual input stimuli has been addressed by numerous approaches in machine learning as well as in the computational neurosciences. However, the question of what kind of mechanisms might be involved in the process of subcategory learning, or category refinement, remains a topic of active investigation. We propose a recurrent computational network architecture for the unsupervised learning of categorical and subcategorical visual input representations. During learning, the connection strengths of bottom-up weights from input to higher-level category representations are adapted according to the input activity distribution. In a similar manner, top-down weights learn to encode the characteristics of a specific stimulus category. Feedforward and feedback learning in combination realize an associative memory mechanism, enabling the selective top-down propagation of a category's feedback weight distribution. We suggest that the difference between the expected input encoded in the projective field of a category node and the current input pattern controls the amplification of feedforward-driven representations. Sufficiently large differences trigger the recruitment of new representational resources and the establishment of (sub-)category representations. We demonstrate the temporal evolution of such learning and show how the approach successfully establishes category and subcategory representations.
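A heavily simplified, ART-like sketch of this mismatch-driven recruitment idea follows; the class name, the Euclidean mismatch measure, and the vigilance parameter are illustrative assumptions rather than the paper's recurrent architecture:

```python
import numpy as np

class CategoryRefiner:
    """Toy mismatch-driven category recruitment. Each category stores a
    top-down prototype (its 'projective field'). An input is assigned
    to the best-matching prototype; if the mismatch between expected
    and actual input exceeds a vigilance threshold, a new (sub)category
    node is recruited instead."""

    def __init__(self, vigilance=1.0, lr=0.2):
        self.prototypes = []        # top-down feedback weights per category
        self.vigilance = vigilance  # mismatch level that triggers recruitment
        self.lr = lr

    def present(self, x):
        if self.prototypes:
            errs = [np.linalg.norm(x - p) for p in self.prototypes]
            k = int(np.argmin(errs))
            if errs[k] <= self.vigilance:
                # Match: refine the winning prototype toward the input
                # (bottom-up and top-down weights adapt together here).
                self.prototypes[k] += self.lr * (x - self.prototypes[k])
                return k
        # Mismatch too large (or no categories yet): recruit a new node.
        self.prototypes.append(x.copy())
        return len(self.prototypes) - 1

rng = np.random.default_rng(1)
refiner = CategoryRefiner(vigilance=1.0)
data = np.vstack([rng.normal(0.0, 0.1, (20, 8)),   # one visual cluster
                  rng.normal(3.0, 0.1, (20, 8))])  # a clearly distinct cluster
labels = [refiner.present(x) for x in data]
print(len(refiner.prototypes))                     # typically 2 nodes recruited
```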
Advancing Parsimonious Deep Learning Weather Prediction using the HEALPix Mesh
We present a parsimonious deep learning weather prediction model on the Hierarchical Equal Area isoLatitude Pixelization (HEALPix) to forecast seven atmospheric variables for arbitrarily long lead times on a global, approximately 110 km mesh at 3 h time resolution. In comparison to state-of-the-art machine learning weather forecast models, such as Pangu-Weather and GraphCast, our DLWP-HPX model uses coarser resolution and far fewer prognostic variables. Yet, at one-week lead times its skill is only about one day behind the state-of-the-art numerical weather prediction model from the European Centre for Medium-Range Weather Forecasts. We report successive forecast improvements resulting from model design and data-related decisions, such as switching from the cubed sphere to the HEALPix mesh, inverting the channel depth of the U-Net, and introducing gated recurrent units (GRUs) on each level of the U-Net hierarchy. The consistent east-west orientation of all cells on the HEALPix mesh facilitates the development of location-invariant convolution kernels that are successfully applied to propagate global weather patterns across our planet. Without any loss of spectral power after two days, the model can be unrolled autoregressively for hundreds of steps into the future to generate stable and realistic states of the atmosphere that respect seasonal trends, as showcased in one-year simulations. Our parsimonious DLWP-HPX model is research-friendly and potentially well-suited for sub-seasonal and seasonal forecasting.
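For illustration, a convolutional GRU cell of the kind that could sit on one U-Net level and carry state across time steps might look as follows; this is a generic PyTorch sketch, not the DLWP-HPX code, and the HEALPix-specific convolutions and padding are omitted:

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Generic convolutional GRU cell for one U-Net level: feature maps
    flow in at each time step, and a hidden state of the same shape is
    carried forward via learned update (z) and reset (r) gates."""

    def __init__(self, channels, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # Gates are computed from the concatenated input and hidden state.
        self.gates = nn.Conv2d(2 * channels, 2 * channels, kernel_size, padding=pad)
        self.cand = nn.Conv2d(2 * channels, channels, kernel_size, padding=pad)

    def forward(self, x, h):
        z, r = torch.chunk(torch.sigmoid(self.gates(torch.cat([x, h], dim=1))), 2, dim=1)
        h_new = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))  # candidate state
        return (1.0 - z) * h + z * h_new                             # gated update

cell = ConvGRUCell(channels=8)
x = torch.randn(1, 8, 16, 16)   # feature maps at one time step
h = torch.zeros_like(x)         # initial hidden state
h = cell(x, h)                  # state carried to the next time step
```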
Spatial memory for vertical locations
Most studies on spatial memory address the horizontal plane, leaving open the question of whether findings generalize to vertical spaces, where gravity and the visual upright of our surrounding space are salient orientation cues. In three experiments, we examined which reference frame is used to organize memory for vertical locations: the one based on the body vertical, the visual-room vertical, or the direction of gravity. Participants judged interobject spatial relationships learned from a vertical layout in a virtual room. During learning and testing, we varied the orientation of the participant’s body (upright vs. lying sideways) and of the visually presented room relative to gravity (e.g., rotated by 90° along the frontal plane). Across all experiments, participants made quicker or more accurate judgments when the room was oriented in the same way as during learning with respect to their body, irrespective of their orientation relative to gravity. This suggests that participants employed an egocentric body-based reference frame for representing vertical object locations. Our study also revealed an effect of body–gravity alignment during testing: participants recalled spatial relations more accurately when upright, regardless of the body and visual-room orientation during learning. This finding is consistent with a hypothesized selection conflict between different reference frames. Overall, our results suggest that a body-based reference frame is preferred over salient allocentric reference frames in memory for vertical locations perceived from a single view. Further, memory of vertical space seems tuned to work best in the default upright body orientation.